DETAILED ACTION

1. This action is in response to the amendment, filed 10/22/04.

2. Claims 1-17 are currently pending.

Claim Objections 



3. The examiner notes that independent claims 1, 8 and 13 should have been
marked with --Currently amended-- and not "Original claim".

Additionally, independent claims 1, 8 and 13 containing the language "some (but
not necessarily all)" should be --some, but not necessarily all,-- at p. 2:13, 4:15 and
5:16.

Appropriate correction is required.

Claim Rejections - 35 USC § 102

4. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless -

(b) the invention was patented or described in a printed publication in this or a foreign country or in public
use or on sale in this country, more than one year prior to the date of application for patent in the United
States.

5. Claims 1-17 are rejected under 35 U.S.C. 102(b) as being anticipated by
Santhanam, U.S. Patent No. 5,704,053.

As per claim 1, Santhanam discloses a method for generating code to
perform anticipatory prefetching for data references, (col. 3:47-49, "The current 
invention provides a new compiler for such a processor that facilitates efficient insertion 
of explicit data prefetch instructions into loops within application programs"), 
comprising: 

- receiving code to be executed on a computer system; analyzing the code 
to identify data references to be prefetched (col. 3:50-51, "The compiler uses ... 
analysis (techniques) to determine data prefetching requirements"), 

- inserting prefetch instructions into the code in advance of the identified 
data references (col. 3:51-53, "Analysis and explicit data cache prefetch instruction 
insertion are performed by the compiler"), 

-wherein inserting the prefetch instructions involves: 

- attempting to calculate a stride value for a given data reference 
within a loop (col. 6:3-5, "The compiler can predict (by attempting to calculate a 
stride value) which data (reference) is needed in advance for loops that access 
array elements in a regular fashion"), 

- if the stride value cannot be calculated, setting the stride value to a 
default stride value (col. 14:48-49, "(if the stride cannot be calculated), then
substitute some fixed constant, C"), 

- inserting a prefetch instruction to prefetch the given data reference 
for a subsequent loop iteration based on the stride value (col. 6:5-8, "The 
compiler can then insert prefetch instructions into loops such that array elements
that are likely to be needed in future loop iterations are retrieved from memory
ahead of time"). 

-wherein the stride value is constant for some (but not necessarily all) loop 
iterations (col. 2:25-28, "because the analysis is done at the source code level, it is
difficult to estimate the prefetch iteration distance (PFID) (in this situation the stride 
value is constant for some but not necessarily all loop iterations), i.e. the PFID used is 
always one loop iteration (i.e. default prefetch distance)"). 
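
For illustration only, the stride limitations discussed above for claim 1 can be sketched as follows. This Python sketch is not taken from the claims or from Santhanam. The function names, the 64-byte default, and the use of a sampled address sequence in place of compile-time analysis are all assumptions made for the example.

```python
from typing import Optional, Sequence

# Hypothetical default stride in bytes; the record does not give a value.
DEFAULT_STRIDE = 64


def try_compute_stride(addresses: Sequence[int]) -> Optional[int]:
    """Return the constant difference between successive addresses, or None."""
    if len(addresses) < 2:
        return None
    deltas = {b - a for a, b in zip(addresses, addresses[1:])}
    # Exactly one distinct delta means the stride is constant over the sample.
    return deltas.pop() if len(deltas) == 1 else None


def choose_stride(addresses: Sequence[int], default: int = DEFAULT_STRIDE) -> int:
    """Use the calculated stride when one exists, otherwise the default."""
    stride = try_compute_stride(addresses)
    return default if stride is None else stride


def prefetch_target(address: int, stride: int, iterations_ahead: int = 1) -> int:
    """Address expected to be referenced `iterations_ahead` iterations later."""
    return address + stride * iterations_ahead
```

Under these assumptions, a reference sequence of 0, 8, 16 yields a stride of 8, while a sequence of 0, 8, 24 has no single stride and falls back to the default.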

As per claim 2, the rejection of claim 1 is incorporated and further, Santhanam 
discloses allowing a system user to specify the default stride value (col. 13:39, 
"Estimating the average loop iteration latency"). 

As per claim 3, the rejection of claim 1 is incorporated and further, Santhanam 
discloses that calculating the stride value involves: 

- identifying an induction variable for the stride value (col. 11:23, "Identify
simple basic loop induction variables"), 

- identifying a stride function for the stride value and calculating the stride 
value based upon the stride function and the induction variable (col. 17:54-60, "a 
net loop increment of eight, and the element size of "A" is 8-bytes, this is a large stride 
equivalence class, assuming a 32-byte cache line size (8 × 8 bytes = 64 bytes) > 32
bytes"). 
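
The arithmetic in the passage quoted above can be checked with a short sketch. The function names are hypothetical. The "large stride" test (stride greater than the cache line size) is inferred from the quoted comparison and is an assumption about how the classification is made.

```python
def memory_stride_bytes(net_loop_increment: int, element_size_bytes: int) -> int:
    """Bytes advanced per loop iteration: elements per iteration times element size."""
    return net_loop_increment * element_size_bytes


def is_large_stride(stride_bytes: int, cache_line_bytes: int) -> bool:
    """Assumed rule: a stride is 'large' when it exceeds one cache line."""
    return stride_bytes > cache_line_bytes


# Values from the quoted passage: increment of eight, 8-byte elements, 32-byte line.
stride = memory_stride_bytes(8, 8)
large = is_large_stride(stride, 32)
```

With the quoted values, `stride` is 64 bytes and `large` is true, matching the "64 bytes > 32 bytes" comparison in the citation.
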
As per claim 4, the rejection of claim 1 is incorporated and further, Santhanam
discloses that inserting the prefetch instruction based on the stride value involves: 

- calculating a prefetch cover distance by dividing a cache line size by the 
stride value (col. 15:64-67, "When the memory stride is <=cache line size, B(i) is 
considered to be in the same cluster as B(i+1), and therefore omitted for prefetch 
consideration (i.e. the prefetch cover distance is calculated based on the cache line size
and stride value)", and col. 17:54-66, "(Because the loop has) a net loop increment of
eight, and the element size of "A" is 8-bytes, this is a large stride equivalence class,
assuming a 32-byte cache line size (8 × 8 bytes = 64 bytes) > 32 bytes. All eight
references to "A" are placed into the same cluster because they exhibit group spatial
locality, and no group temporal locality. The cluster leader is the reference to A[i+7], and
the span of the cluster is 64-bytes (i.e. &A[i+7]-&A[i]). If the prefetch memory distance
was computed earlier to be 128-bytes, i.e. corresponding to a prefetch iteration distance
of two, it is only necessary to insert three prefetch instructions to account for the entire
span of this 8-member cluster."),

- calculating a prefetch ahead distance as a function of a prefetch latency, 
the prefetch cover distance and an execution time of a loop (col. 7:11-18, "The
memory address is determined based on the number of loop iterations in advance (i.e.
the prefetch iteration distance or PFID) that data items need to be prefetched to fully
hide the time required to service potential data cache misses. The PFID is determined
taking into account the nature of the loop body instructions (i.e. execution time of the
loop and the prefetch cover distance) and characteristics of the target processor and
memory system (i.e. the prefetch latency and prefetch cover distance)"),

- calculating a prefetch address by multiplying the stride value by the 
prefetch cover distance and the prefetch ahead distance and adding the result to 
an address accessed by the given data reference (col. 7:11-18, "The memory
address is determined based on the number of loop iterations in advance (i.e. the 
prefetch iteration distance or PFID) that data items need to be prefetched to fully hide 
the time required to service potential data cache misses. The PFID is determined taking 
into account the nature of the loop body instructions and characteristics of the target 
processor and memory system."). 
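
The three calculations recited in claim 4 can be sketched as follows, for illustration only. The cover distance and prefetch address follow the claim language as quoted above. The claim states only that the prefetch ahead distance is "a function of" the latency, the cover distance and the loop execution time, so the ceiling formula below is an assumption, as are the function names, the integer division, the floor of 1, and the sample values.

```python
import math


def prefetch_cover_distance(cache_line_bytes: int, stride_bytes: int) -> int:
    """Cache line size divided by the stride value (claim 4).

    Assumption: integer division, with a floor of 1 when the stride
    exceeds the cache line size. Expects stride_bytes > 0.
    """
    return max(1, cache_line_bytes // stride_bytes)


def prefetch_ahead_distance(latency_cycles: int, cover_distance: int,
                            loop_cycles: int) -> int:
    """Assumed formula: ceil(latency / (cover distance * loop execution time))."""
    return math.ceil(latency_cycles / (cover_distance * loop_cycles))


def prefetch_address(address: int, stride_bytes: int,
                     cover_distance: int, ahead_distance: int) -> int:
    """Stride times cover distance times ahead distance, added to the address."""
    return address + stride_bytes * cover_distance * ahead_distance


# Hypothetical inputs: 64-byte line, 8-byte stride, 100-cycle latency, 5-cycle loop.
cover = prefetch_cover_distance(64, 8)
ahead = prefetch_ahead_distance(100, cover, 5)
target = prefetch_address(1000, 8, cover, ahead)
```

With these hypothetical inputs the cover distance is 8, the ahead distance is 3, and the prefetch address is 1192.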

As per claim 5, the rejection of claim 1 is incorporated and further, Santhanam 
discloses that analyzing the code involves: 

- identifying loop bodies within the code; identifying data references to be 
prefetched from within the loop bodies (col. 8:30-35, "One important feature of the 
invention identifies loops and access patterns to allow a determination of how many
cycles are devoted to loop iterations, and therefore allows insertion of the prefetch
instruction to a location of an array that is sufficiently far in advance to make sure that 
the miss time is minimized."). 

As per claim 6, the rejection of claim 5 is incorporated and further, Santhanam 
discloses that analyzing the code to identify data references to be prefetched involves
examining a pattern of data references over multiple loop iterations (col. 14:6-10,
"Now, it is also necessary to address the issue of loops that have internal branches. The 
minimum loop iteration latency for such loops is estimated by using previously collected 
execution profile information, which indicates the execution count for each basic block in 
the loop body."). 

As per claim 7, the rejection of claim 1 is incorporated and further, Santhanam 
discloses that analyzing the code involves analyzing the code within a compiler (col. 
3:47-49, "The current invention provides a new compiler for such a processor that 
facilitates efficient insertion of explicit data prefetch instructions into loops within 
application programs"). 

As per claims 8-12, this is a computer readable medium/product version of the 
claimed method discussed above, in claims 1-7, wherein all claimed limitations have
also been addressed and/or cited as set forth above. For example, see Santhanam's 
"new compiler" (col. 3:47-49). 

As per claims 13-17, this is an apparatus version of the claimed method
discussed above, in claims 1-7, wherein all claimed limitations have also been
addressed and/or cited as set forth above. For example, see Santhanam Fig. 1
computer architecture, item 10 and associated text.

Response to Arguments

6. Applicant's arguments have been considered but they are not persuasive.

In the remarks, the applicant has argued substantially that: 

1) Santhanam teaches away from the present invention because Santhanam does
not optimize loops in which the stride is a constant value for some (but not necessarily 
all) loop iterations, at p. 9:13-10:10. 

Examiner's response: 

1) The Examiner disagrees with applicant's characterization of the applied art.
Santhanam does not teach away from the present invention. Applicant agrees that
Santhanam teaches optimizing loops in which the stride value is a constant value for all
loop iterations, at p. 10:4-7 of the amendment. Santhanam also discloses that in
situations where the stride value is unknown (i.e. constant for some but not necessarily 
all loop iterations), optimization is performed by inserting prefetch instructions using the 
default prefetch distance (one loop iteration), at col. 2:25-28. 

Conclusion 

7. Applicant's amendment necessitated the new ground(s) of rejection presented in
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37
CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Andre R. Fowlkes, whose telephone number is (571)
272-3697. The examiner can normally be reached Monday - Friday, 8:00am-4:30pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Tuan Q. Dam, can be reached at (571) 272-3695. The fax phone number for
the organization where this application or proceeding is assigned is 703-872-9306.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free).